Robust Error Detection: A Hybrid Approach Combining Unsupervised Error Detection and Linguistic Knowledge

Author

  • Johnny Bigert
Abstract

This article presents a robust probabilistic method for the detection of context-sensitive spelling errors. The algorithm identifies less-frequent grammatical constructions and attempts to transform them into more-frequent constructions while retaining similar syntactic structure. If the transformations result in low-frequency constructions, the text is likely to contain an error. A first unsupervised approach uses only information derived from a part-of-speech tagged corpus. This experiment shows a good error detection capacity but also a high rate of false alarms, in many cases due to phrase and clause boundaries. In a second approach, we combine the first method with robust phrase and clause recognition to avoid many of the false alarms in the first experiment. A comparative evaluation of the experiments shows that the introduction of linguistic knowledge dramatically increases the precision of the error detection method.
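
The first, unsupervised idea in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: it assumes part-of-speech tag trigrams as the "constructions", a fixed count threshold for "low frequency", and a hand-written `similar` map of interchangeable tags, none of which are specified in the abstract. The function names `trigram_counts` and `suspicious_positions` are hypothetical.

```python
from collections import Counter


def trigram_counts(tagged_sentences):
    """Count POS-tag trigrams over a corpus given as lists of tags."""
    counts = Counter()
    for tags in tagged_sentences:
        for i in range(len(tags) - 2):
            counts[tuple(tags[i:i + 3])] += 1
    return counts


def suspicious_positions(tags, counts, similar, threshold=2):
    """Return start indices of trigrams that stay rare after substitution.

    `similar` maps a tag to tags assumed to be syntactically close
    (an illustrative stand-in for the paper's transformations).
    """
    flagged = []
    for i in range(len(tags) - 2):
        tri = tuple(tags[i:i + 3])
        if counts[tri] >= threshold:
            continue  # frequent construction, nothing to check
        rescued = False
        # Replace one tag at a time with an assumed-similar tag.
        for pos in range(3):
            for alt in similar.get(tri[pos], ()):
                candidate = tri[:pos] + (alt,) + tri[pos + 1:]
                if counts[candidate] >= threshold:
                    rescued = True
        if not rescued:
            flagged.append(i)  # still rare: possible error here
    return flagged


# Toy corpus and similarity map, invented for illustration only.
corpus = [["DT", "NN", "VB"]] * 3 + [["DT", "JJ", "NN"]] * 3
counts = trigram_counts(corpus)
similar = {"NNS": ["NN"]}

rare_but_rescued = suspicious_positions(["DT", "NNS", "VB"], counts, similar)
rare_and_flagged = suspicious_positions(["DT", "VB", "VB"], counts, similar)
```

In this toy setup a rare trigram that maps onto a frequent one is accepted, while one that stays rare is flagged. The second approach in the abstract would additionally suppress flags that fall on phrase and clause boundaries, which this sketch does not attempt.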

Similar resources

Combining of Magnitude and Direction of Change Indices to Unsupervised Change Detection in Multitemporal Multispectral Remote Sensing Images

In remote sensing, image-based change detection techniques analyze two images acquired over the same area at different times t1 and t2 to identify the changes that occurred on the Earth's surface. Change detection approaches are mainly categorized as supervised and unsupervised. Generating the change index is a key step for change detection in multi-temporal remote sensing images. Unsupervised chan...

Automated Error Detection in Digitized Cultural Heritage Documents

The work reported in this paper aims at performance optimization in the digitization of documents pertaining to the cultural heritage domain. A hybrid method is proposed, combining statistical classification algorithms and linguistic knowledge to automate post-OCR error detection and correction. The current paper deals with the integration of linguistic modules and their impact on error

Robust Fault Detection on Boiler-turbine Unit Actuators Using Dynamic Neural Networks

Due to the important role of the boiler-turbine units in industries and electricity generation, it is important to diagnose different types of faults in different parts of boiler-turbine system. Different parts of a boiler-turbine system like the sensor or actuator or plant can be affected by various types of faults. In this paper, the effects of the occurrence of faults on the actuators are in...

An approach to fault detection and correction in design of systems using of Turbo codes

We present an approach to the design of fault-tolerant computing systems. In this paper, a technique is employed that enables the combination of several codes in order to obtain flexibility in the design of error-correcting codes. Code-combining techniques are very effective, and turbo codes are one such code. Algorithm-based fault tolerance techniques that detect errors rely on the c...

A Robust Structural Fingerprint Restoration

Fast and accurate ridge detection in fingerprints is essential to every AFIS (Automatic Fingerprint Identification System). Smudged furrows and cut ridges in the image of a fingerprint are major problems in any AFIS. This paper investigates a new online ridge detection method that reduces the complexity and costs associated with the fingerprint identification procedure. The noise in fingerprint...

Publication date: 2002